EDA Modules

In [1]:
import pandas as pd
In [3]:
df = pd.read_csv('https://raw.githubusercontent.com/rromanss23/Machine_Leaning_Engineer_Udacity_NanoDegree/master/projects/boston_housing/housing.csv')
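
Before handing the frame to an automated report, a quick manual check of dtypes, nulls, and distinct values gives a baseline to compare against. The sketch below is only illustrative: `quick_overview` is a hypothetical helper, and the toy rows are made up (they reuse the column names RM, LSTAT, PTRATIO and MEDV but are not taken from housing.csv).

```python
import pandas as pd


def quick_overview(frame: pd.DataFrame) -> pd.DataFrame:
    """Per-column dtype, null count, and distinct-value count (NaN excluded)."""
    return pd.DataFrame({
        "dtype": frame.dtypes.astype(str),
        "nulls": frame.isna().sum(),
        "nunique": frame.nunique(),
    })


# Toy rows, invented for illustration only (NOT real values from housing.csv).
toy = pd.DataFrame({
    "RM": [6.5, 6.4, 7.1],
    "LSTAT": [4.9, 9.1, None],
    "PTRATIO": [15.3, 17.8, 17.8],
    "MEDV": [504000.0, 453600.0, 728700.0],
})

overview = quick_overview(toy)
print(overview)
```

On the real data, `quick_overview(df)` could be run in the same way right after the `read_csv` cell.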

I - Pandas Profiling

In [8]:
# !pip install pandas_profiling
In [9]:
# Import the module
from pandas_profiling import ProfileReport
In [23]:
# Generate the report
profile = ProfileReport(df, title='Boston house pricing')
In [11]:
# Display the report
# profile.to_widgets()
profile.to_file("Pandas_Profile.html")

II - Sweetviz

In [13]:
# !pip install sweetviz
In [14]:
# Import the module
import sweetviz as sv
In [15]:
# Generate the report
my_report = sv.analyze(df)
In [16]:
# The report can be exported to HTML or previewed in the notebook:

my_report.show_html()                # Exports to HTML
# my_report.show_notebook()          # Previews in the notebook
Report SWEETVIZ_REPORT.html was generated! NOTEBOOK/COLAB USERS: the web browser MAY not pop up, regardless, the report IS saved in your notebook/colab files.

III - DataPrep

In [20]:
#! pip install dataprep
In [21]:
from dataprep.eda import create_report
In [26]:
report = create_report(df, title = 'Boston house pricing')
/home/mato/jupyter/jupyterenv/lib/python3.10/site-packages/dask/core.py:119: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))
In [28]:
report.save('dataprep_report')
Report has been saved to dataprep_report.html!
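
Since the Sweetviz message above warns that the browser may not open, it can help to confirm that the three HTML reports were actually written. A minimal sketch using only the standard library follows; `find_reports` is a hypothetical helper, and the file names are the ones that appear in the cells and outputs above, assumed to live in the notebook's working directory.

```python
from pathlib import Path

# File names as written by the cells above; adjust if you changed them.
REPORT_FILES = ["Pandas_Profile.html", "SWEETVIZ_REPORT.html", "dataprep_report.html"]


def find_reports(directory=".", names=REPORT_FILES):
    """Map each expected report name to its size in bytes, or None if missing."""
    base = Path(directory)
    return {
        name: (base / name).stat().st_size if (base / name).is_file() else None
        for name in names
    }


for name, size in find_reports().items():
    status = f"{size} bytes" if size is not None else "not found"
    print(f"{name}: {status}")
```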

IV - AutoViz

In [35]:
#!pip install autoviz
In [37]:
from autoviz.AutoViz_Class import AutoViz_Class
AV = AutoViz_Class()
df = AV.AutoViz('https://raw.githubusercontent.com/rromanss23/Machine_Leaning_Engineer_Udacity_NanoDegree/master/projects/boston_housing/housing.csv')
Shape of your Data Set loaded: (489, 4)
#######################################################################################
######################## C L A S S I F Y I N G  V A R I A B L E S  ####################
#######################################################################################
Classifying variables in data set...
Data cleaning improvement suggestions. Complete them before proceeding to ML modeling.
         Nuniques    dtype  Nulls  Nullpercent  NuniquePercent  Value counts Min  Data cleaning improvement suggestions
LSTAT         442  float64      0     0.000000       90.388548                 0
RM            430  float64      0     0.000000       87.934560                 0
MEDV          228  float64      0     0.000000       46.625767                 0
PTRATIO        44  float64      0     0.000000        8.997955                 0
    4 Predictors classified...
        No variables removed since no ID or low-information variables found in data set
Number of All Scatter Plots = 10
No categorical or numeric vars in data set. Hence no bar charts.
All Plots done
Time to run AutoViz = 2 seconds 

 ###################### AUTO VISUALIZATION Completed ########################
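
The NuniquePercent column in the table above appears to be the number of distinct values divided by the number of rows, times 100. That formula is inferred from the printed numbers, not taken from the AutoViz source, so treat it as an assumption. The sketch below recomputes the column from the counts shown in the output; `percent` is a hypothetical helper.

```python
def percent(count, total):
    """Share of `total` represented by `count`, as a percentage."""
    return 100.0 * count / total


N_ROWS = 489  # from "Shape of your Data Set loaded: (489, 4)"
NUNIQUES = {"LSTAT": 442, "RM": 430, "MEDV": 228, "PTRATIO": 44}  # from the table

nunique_percent = {col: percent(n, N_ROWS) for col, n in NUNIQUES.items()}
for col, pct in nunique_percent.items():
    print(f"{col:8s} {pct:.6f}")
```

A high percentage (LSTAT, RM) means the column is close to continuous, while a low one (PTRATIO) means that many rows share the same value.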
In [42]:
#!pip install jupyter_contrib_nbextensions
In [46]:
!jupyter nbconvert --to html PaquetesEDA.ipynb
[NbConvertApp] Converting notebook PaquetesEDA.ipynb to html
Traceback (most recent call last):
  File "/home/mato/jupyter/jupyterenv/bin/jupyter-nbconvert", line 8, in <module>
    sys.exit(main())
  File "/home/mato/jupyter/jupyterenv/lib/python3.10/site-packages/jupyter_core/application.py", line 269, in launch_instance
    return super().launch_instance(argv=argv, **kwargs)
  File "/home/mato/jupyter/jupyterenv/lib/python3.10/site-packages/traitlets/config/application.py", line 976, in launch_instance
    app.start()
  File "/home/mato/jupyter/jupyterenv/lib/python3.10/site-packages/nbconvert/nbconvertapp.py", line 423, in start
    self.convert_notebooks()
  File "/home/mato/jupyter/jupyterenv/lib/python3.10/site-packages/nbconvert/nbconvertapp.py", line 597, in convert_notebooks
    self.convert_single_notebook(notebook_filename)
  File "/home/mato/jupyter/jupyterenv/lib/python3.10/site-packages/nbconvert/nbconvertapp.py", line 560, in convert_single_notebook
    output, resources = self.export_single_notebook(
  File "/home/mato/jupyter/jupyterenv/lib/python3.10/site-packages/nbconvert/nbconvertapp.py", line 488, in export_single_notebook
    output, resources = self.exporter.from_filename(
  File "/home/mato/jupyter/jupyterenv/lib/python3.10/site-packages/nbconvert/exporters/exporter.py", line 189, in from_filename
    return self.from_file(f, resources=resources, **kw)
  File "/home/mato/jupyter/jupyterenv/lib/python3.10/site-packages/nbconvert/exporters/exporter.py", line 206, in from_file
    return self.from_notebook_node(
  File "/home/mato/jupyter/jupyterenv/lib/python3.10/site-packages/nbconvert/exporters/html.py", line 223, in from_notebook_node
    return super().from_notebook_node(nb, resources, **kw)
  File "/home/mato/jupyter/jupyterenv/lib/python3.10/site-packages/nbconvert/exporters/templateexporter.py", line 413, in from_notebook_node
    output = self.template.render(nb=nb_copy, resources=resources)
  File "/home/mato/jupyter/jupyterenv/lib/python3.10/site-packages/jinja2/environment.py", line 1291, in render
    self.environment.handle_exception()
  File "/home/mato/jupyter/jupyterenv/lib/python3.10/site-packages/jinja2/environment.py", line 925, in handle_exception
    raise rewrite_traceback_stack(source=source)
  File "/home/mato/.local/share/jupyter/nbconvert/templates/lab/index.html.j2", line 3, in top-level template code
    {% from 'jupyter_widgets.html.j2' import jupyter_widgets %}
  File "/home/mato/.local/share/jupyter/nbconvert/templates/lab/base.html.j2", line 260, in top-level template code
    {% set div_id = uuid4() %}
  File "/home/mato/.local/share/jupyter/nbconvert/templates/base/display_priority.j2", line 1, in top-level template code
    {%- extends 'base/null.j2' -%}
  File "/home/mato/.local/share/jupyter/nbconvert/templates/base/null.j2", line 110, in top-level template code
    {%- block footer -%}
  File "/home/mato/.local/share/jupyter/nbconvert/templates/lab/index.html.j2", line 157, in block 'footer'
    {{ super() }}
  File "/home/mato/.local/share/jupyter/nbconvert/templates/lab/base.html.j2", line 302, in block 'footer'
    {{ nb.metadata.widgets[mimetype] | json_dumps | escape_html_script }}
jinja2.exceptions.TemplateRuntimeError: No filter named 'escape_html_script' found.